ANN-based Innovative Segmentation Method for Handwritten text in Assamese
Abstract
Artificial Neural Networks (ANNs), which partially emulate human thinking within the domain of artificial intelligence, have been widely used for recognition of optically scanned characters. Before recognition, however, the text must be segmented into sentences, words, and characters. Segmentation of words into individual letters has been one of the major problems in handwriting recognition. Despite several successful works all over the world, development of such tools for specific languages is still an ongoing process, especially in the Indian context. This work explores the application of ANNs as an aid to segmentation of handwritten characters in Assamese, an important language in the north-eastern part of India. It examines the performance difference between an ANN-based dynamic segmentation algorithm and projection-based static segmentation. The algorithm first trains an ANN on individual handwritten characters recorded from different individuals. Handwritten sentences are separated from the text using a static segmentation method. Individual characters are then separated from each segmented line by first over-segmenting the entire line. Each of the segments thus obtained is then fed to the trained ANN. At the point where the ANN recognizes a segment, or a combination of several segments, as similar to a handwritten character, a segmentation boundary for the character is assumed to exist and segmentation is performed. The segmented character is then compared to the best available match and the segmentation boundary is confirmed.
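The over-segment-then-merge idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `ink_runs`, `dynamic_segment`, and `toy_score`, the greedy merge policy, and the parameters `max_merge` and `threshold` are all assumptions made for illustration. The trained ANN is replaced by a toy scorer that simply prefers pieces four columns wide.

```python
import numpy as np

def ink_runs(line_img, max_ink=0):
    """Projection-based (static) step: return (start, end) column ranges
    whose vertical projection exceeds max_ink. A low max_ink over-segments."""
    inked = line_img.sum(axis=0) > max_ink
    runs, start = [], None
    for x, on in enumerate(inked):
        if on and start is None:
            start = x                      # an ink run begins
        elif not on and start is not None:
            runs.append((start, x))        # the ink run ends (end exclusive)
            start = None
    if start is not None:
        runs.append((start, len(inked)))
    return runs

def dynamic_segment(line_img, pieces, score_fn, max_merge=3, threshold=0.5):
    """Dynamic step: greedily merge up to max_merge consecutive pieces and keep
    the merge that score_fn rates highest. score_fn(image) -> (label, confidence)
    stands in for the trained ANN (hypothetical interface)."""
    chars, i = [], 0
    while i < len(pieces):
        best = None
        for k in range(1, max_merge + 1):
            if i + k > len(pieces):
                break
            start, end = pieces[i][0], pieces[i + k - 1][1]
            label, conf = score_fn(line_img[:, start:end])
            if best is None or conf > best[0]:
                best = (conf, k, start, end, label)
        conf, k, start, end, label = best
        if conf < threshold:
            # Nothing looked like a character: emit the first piece unlabelled.
            start, end = pieces[i]
            label, k = None, 1
        chars.append((start, end, label, conf))
        i += k
    return chars

def toy_score(img):
    """Toy stand-in for the ANN: confidence peaks when the piece is 4 columns wide."""
    return "char", 1.0 / (1.0 + abs(img.shape[1] - 4))

# Two synthetic "characters", each with an internal gap that splits it in two.
img = np.zeros((5, 12), dtype=np.uint8)
img[:, [0, 1, 3, 6, 7, 9]] = 1
pieces = ink_runs(img)
chars = dynamic_segment(img, pieces, toy_score)
boundaries = [(s, e) for s, e, _, _ in chars]
```

In this toy run the static step yields four pieces, and the merge step recombines them into two character-sized spans. A real system would replace `toy_score` with the trained network's output and would likely need a more careful search than this greedy pass.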
Similar resources
Bi-lingual Handwritten Character and Numeral Recognition using Multi-Dimensional Recurrent Neural Networks (MDRNN)
The continued success of ANNs depends considerably on the use of hybrid structures implemented on cooperative frameworks. Hybrid architectures give the ANN the ability to validate heterogeneous learning paradigms. This work describes the implementation of a set of Distributed and Hybrid ANN models for Character Recognition applied to Anglo-Assamese scripts. The objective is t...
Connected Component Based Word Spotting on Persian Handwritten image documents
Word spotting makes unindexed image documents searchable by locating a word or words in a document image, given a query word. This problem is challenging, mainly due to the large number of word classes with very small inter-class and substantial intra-class distances. In this paper, a segmentation-based word spotting method is presented for multi-writer Persian handwritten doc-...
A Dataset of Online Handwritten Assamese Characters
This paper describes the Tezpur University dataset of online handwritten Assamese characters. The online data acquisition process involves the capturing of data as the text is written on a digitizer with an electronic pen. A sensor picks up the pen-tip movements, as well as pen-up/pen-down switching. The dataset contains 8,235 isolated online handwritten Assamese characters. Preliminary results...
A Zoning based Feature Extraction method for Recognition of Handwritten Assamese Characters
This paper introduces a novel feature extraction approach for handwritten Assamese character recognition. The performance of an optical character recognition system highly depends on the extracted feature set. Hence, feature extraction plays a significant role in achieving high recognition accuracy. Also, not all the features of an image are useful for classification and therefore feature extra...
Using an artificial neural network approach for off-line sentence segmentation
This paper uses an Artificial Neural Network (ANN) architecture to segment unconstrained English handwritten sentences into single words. The ANN receives a feature set of the handwritten text line and classifies each column of the image as belonging to a word or to a gap between words. As a result, the sequences of columns with the same classification represent the segmented words or inter-word gaps....
Journal: CoRR
Volume: abs/0911.0907
Publication date: 2009